{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "tn-NTlKI2EYR"
   },
   "source": [
    "# Semantic reranking with Elasticsearch and Hugging Face\n",
    "\n",
    "_Authored by: [Liam Thompson](https://github.com/leemthompo)_\n",
    "\n",
    "In this notebook we will learn how to implement semantic reranking in Elasticsearch by uploading a model from Hugging Face into an Elasticsearch cluster. We'll use the `retriever` abstraction, a simpler Elasticsearch syntax for crafting queries and combining different search operations.\n",
    "\n",
    "You will:\n",
    "\n",
    "- Choose a cross-encoder model from Hugging Face to perform semantic reranking\n",
    "- Upload the model to your Elasticsearch deployment using [Eland](https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html)— a Python client for machine learning with Elasticsearch \n",
    "- Create an inference endpoint to manage your `rerank` task\n",
    "- Query your data using the `text_similarity_rerank` retriever"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-tir9w4Sz80v"
   },
   "source": [
    "## 🧰 Requirements\n",
    "\n",
    "For this example, you will need:\n",
    "\n",
    "- An Elastic deployment on version 8.15.0 or above (for non-serverless deployments)\n",
    "    \n",
    "    - We'll be using Elastic Cloud for this example (available with a [free trial](https://cloud.elastic.co/registration)).\n",
    "    - See our other [deployment options](https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro.html#elasticsearch-intro-deploy)\n",
    "-  You'll need to find your deployment's Cloud ID and create an API key. [Learn more](https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Ut-7TfSB2EYS"
   },
   "source": [
    "## Install and import packages\n",
    "\n",
    "ℹ️ The `eland` installation will take a couple of minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AzuZInsj2EYS"
   },
   "outputs": [],
   "source": [
    "!pip install -qU elasticsearch\n",
    "!pip install eland[pytorch]\n",
    "from elasticsearch import Elasticsearch, helpers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FlJXqZkJ2EYT"
   },
   "source": [
    "## Initialize Elasticsearch Python client\n",
    "\n",
    "First you need to connect to your Elasticsearch instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "frMDUtzt2EYT",
    "outputId": "6339d445-e5ef-41e4-83f2-100b58875c27"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Elastic Cloud ID: ··········\n",
      "Elastic Api Key: ··········\n"
     ]
    }
   ],
   "source": [
    "from getpass import getpass\n",
    "\n",
    "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n",
    "ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n",
    "\n",
    "# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n",
    "ELASTIC_API_KEY = getpass(\"Elastic Api Key: \")\n",
    "\n",
    "# Create the client instance\n",
    "client = Elasticsearch(\n",
    "    # For local development\n",
    "    # hosts=[\"http://localhost:9200\"]\n",
    "    cloud_id=ELASTIC_CLOUD_ID,\n",
    "    api_key=ELASTIC_API_KEY,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rL5AEaDrzj9t"
   },
   "source": [
    "## Test connection\n",
    "\n",
    "Confirm that the Python client has connected to your Elasticsearch instance with this test.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "t_lhez7Tznhp"
   },
   "outputs": [],
   "source": [
    "print(client.info())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yR-zE2ii2EYU"
   },
   "source": [
    "This examples uses a small dataset of movies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "bWEpw_X52EYU",
    "outputId": "e4faa762-cb9d-487f-9055-54b2e4476a53"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Done indexing documents into `movies` index!\n"
     ]
    }
   ],
   "source": [
    "from urllib.request import urlopen\n",
    "import json\n",
    "import time\n",
    "\n",
    "url = \"https://huggingface.co/datasets/leemthompo/small-movies/raw/main/small-movies.json\"\n",
    "response = urlopen(url)\n",
    "\n",
    "# Load the response data into a JSON object\n",
    "data_json = json.loads(response.read())\n",
    "\n",
    "# Prepare the documents to be indexed\n",
    "documents = []\n",
    "for doc in data_json:\n",
    "    documents.append(\n",
    "        {\n",
    "            \"_index\": \"movies\",\n",
    "            \"_source\": doc,\n",
    "        }\n",
    "    )\n",
    "\n",
    "# Use helpers.bulk to index\n",
    "helpers.bulk(client, documents)\n",
    "\n",
    "print(\"Done indexing documents into `movies` index!\")\n",
    "time.sleep(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AWd_YOZh2EYU"
   },
   "source": [
    "## Upload Hugging Face model using Eland\n",
    "\n",
    "Now we'll use Eland's `eland_import_hub_model` command to upload the model to Elasticsearch. For this example we've chosen the `cross-encoder/ms-marco-MiniLM-L-6-v2` text similarity model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "sra93pWm2EYV",
    "outputId": "a7bb15dc-bb43-492e-f92e-dd1d78711fb8"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2024-08-13 17:04:12,386 INFO : Establishing connection to Elasticsearch\n",
      "2024-08-13 17:04:12,567 INFO : Connected to serverless cluster 'bd8c004c050e4654ad32fb86ab159889'\n",
      "2024-08-13 17:04:12,568 INFO : Loading HuggingFace transformer tokenizer and model 'cross-encoder/ms-marco-MiniLM-L-6-v2'\n",
      "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n",
      "  warnings.warn(\n",
      "tokenizer_config.json: 100% 316/316 [00:00<00:00, 1.81MB/s]\n",
      "config.json: 100% 794/794 [00:00<00:00, 4.09MB/s]\n",
      "vocab.txt: 100% 232k/232k [00:00<00:00, 2.37MB/s]\n",
      "special_tokens_map.json: 100% 112/112 [00:00<00:00, 549kB/s]\n",
      "pytorch_model.bin: 100% 90.9M/90.9M [00:00<00:00, 135MB/s]\n",
      "STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:312] Completed Stage: Warm Up\n",
      "STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:318] Completed Stage: Collection\n",
      "STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:322] Completed Stage: Post Processing\n",
      "2024-08-13 17:04:18,789 INFO : Creating model with id 'cross-encoder__ms-marco-minilm-l-6-v2'\n",
      "2024-08-13 17:04:21,123 INFO : Uploading model definition\n",
      "100% 87/87 [00:55<00:00,  1.57 parts/s]\n",
      "2024-08-13 17:05:16,416 INFO : Uploading model vocabulary\n",
      "2024-08-13 17:05:16,987 INFO : Starting model deployment\n",
      "2024-08-13 17:05:18,238 INFO : Model successfully imported with id 'cross-encoder__ms-marco-minilm-l-6-v2'\n"
     ]
    }
   ],
   "source": [
    "!eland_import_hub_model \\\n",
    "  --cloud-id $ELASTIC_CLOUD_ID \\\n",
    "  --es-api-key $ELASTIC_API_KEY \\\n",
    "  --hub-model-id cross-encoder/ms-marco-MiniLM-L-6-v2 \\\n",
    "  --task-type text_similarity \\\n",
    "  --clear-previous \\\n",
    "  --start"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DRIABkGAgV_Q"
   },
   "source": [
    "## Create inference endpoint\n",
    "\n",
    "Next we'll create an inference endpoint for the `rerank` task to deploy and manage our model and, if necessary, spin up the necessary ML resources behind the scenes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "DiKsd3YygV_Q",
    "outputId": "c3c46c6b-b502-4167-c98c-d2e2e0a4613c"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ObjectApiResponse({'endpoints': [{'model_id': 'my-msmarco-minilm-model', 'inference_id': 'my-msmarco-minilm-model', 'task_type': 'rerank', 'service': 'elasticsearch', 'service_settings': {'num_allocations': 1, 'num_threads': 1, 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'}, 'task_settings': {'return_documents': True}}]})"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "client.inference.put(\n",
    "    task_type=\"rerank\",\n",
    "    inference_id=\"my-msmarco-minilm-model\",\n",
    "    inference_config={\n",
    "        \"service\": \"elasticsearch\",\n",
    "        \"service_settings\": {\n",
    "            \"model_id\": \"cross-encoder__ms-marco-minilm-l-6-v2\",\n",
    "            \"num_allocations\": 1,\n",
    "            \"num_threads\": 1,\n",
    "        },\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the following command to confirm your inference endpoint is deployed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client.inference.get()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "⚠️ When you deploy your model, you might need to sync your ML saved objects in the Kibana (or Serverless) UI.\n",
    "Go to **Trained Models** and select **Synchronize saved objects**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pDAB9pX3VGKE"
   },
   "source": [
    "## Lexical queries\n",
    "\n",
    "First let's use a `standard` retriever to test out some lexical (or full-text) searches and then we'll compare the improvements when we layer in semantic reranking.\n",
    "\n",
    "### Lexical match with `query_string` query\n",
    "\n",
    "Let's say we vaguely remember that there is a famous movie about a killer who eats his victims. For the sake of argument, pretend we've momentarily forgotten the word \"cannibal\".\n",
    "\n",
    "Let's perform a [`query_string` query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html) to find the phrase \"flesh-eating bad guy\" in the `plot` fields of our Elasticsearch documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "EAG0yZ_bVX-j",
    "outputId": "a8446bbf-ece5-47db-c7cc-ce9856ab9811"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "No search results found\n"
     ]
    }
   ],
   "source": [
    "resp = client.search(\n",
    "    index=\"movies\",\n",
    "    retriever={\n",
    "        \"standard\": {\n",
    "            \"query\": {\n",
    "                \"query_string\": {\n",
    "                    \"query\": \"flesh-eating bad guy\",\n",
    "                    \"default_field\": \"plot\",\n",
    "                }\n",
    "            }\n",
    "        }\n",
    "    },\n",
    ")\n",
    "\n",
    "if resp[\"hits\"][\"hits\"]:\n",
    "    for hit in resp[\"hits\"][\"hits\"]:\n",
    "        title = hit[\"_source\"][\"title\"]\n",
    "        plot = hit[\"_source\"][\"plot\"]\n",
    "        print(f\"Title: {title}\\nPlot: {plot}\\n\")\n",
    "else:\n",
    "    print(\"No search results found\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "48_nf56d2EYV"
   },
   "source": [
    "No results! Unfortunately we don't have any near exact matches for \"flesh-eating bad guy\". Because we don't have any more specific information about the exact phrasing in the Elasticsearch data, we'll need to cast our search net wider.\n",
    "\n",
    "### Simple `multi_match` query\n",
    "\n",
    "This lexical query performs a standard keyword search for the term \"crime\" within the \"plot\" and \"genre\" fields of our Elasticsearch documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "XkAlzHW82EYV",
    "outputId": "f01c6ca5-f2fd-4cba-f5bf-7b8f6914eef6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Title: The Godfather\n",
      "Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.\n",
      "\n",
      "Title: Goodfellas\n",
      "Plot: The story of Henry Hill and his life in the mob, covering his relationship with his wife Karen Hill and his mob partners Jimmy Conway and Tommy DeVito in the Italian-American crime syndicate.\n",
      "\n",
      "Title: The Silence of the Lambs\n",
      "Plot: A young F.B.I. cadet must receive the help of an incarcerated and manipulative cannibal killer to help catch another serial killer, a madman who skins his victims.\n",
      "\n",
      "Title: Pulp Fiction\n",
      "Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.\n",
      "\n",
      "Title: Se7en\n",
      "Plot: Two detectives, a rookie and a veteran, hunt a serial killer who uses the seven deadly sins as his motives.\n",
      "\n",
      "Title: The Departed\n",
      "Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.\n",
      "\n",
      "Title: The Usual Suspects\n",
      "Plot: A sole survivor tells of the twisty events leading up to a horrific gun battle on a boat, which began when five criminals met at a seemingly random police lineup.\n",
      "\n",
      "Title: The Dark Knight\n",
      "Plot: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "resp = client.search(\n",
    "    index=\"movies\",\n",
    "    retriever={\n",
    "        \"standard\": {\n",
    "            \"query\": {\"multi_match\": {\"query\": \"crime\", \"fields\": [\"plot\", \"genre\"]}}\n",
    "        }\n",
    "    },\n",
    ")\n",
    "\n",
    "for hit in resp[\"hits\"][\"hits\"]:\n",
    "    title = hit[\"_source\"][\"title\"]\n",
    "    plot = hit[\"_source\"][\"plot\"]\n",
    "    print(f\"Title: {title}\\nPlot: {plot}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2K1C_Pmzmrxw"
   },
   "source": [
    "That's better! At least we've got some results now. We broadened our search criteria to increase the chances of finding relevant results.\n",
    "\n",
    "But these results aren't very precise in the context of our original query \"flesh-eating bad guy\". We can see that \"The Silence of the Lambs\" is returned in the middle of the results set with this generic `match` query. Let's see if we can use our semantic reranking model to get closer to the searcher's original intent."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Nd8gAE6H2EYV"
   },
   "source": [
    "## Semantic reranker\n",
    "\n",
    "In the following `retriever` syntax, we wrap our standard query retriever in a `text_similarity_reranker`. This allows us to leverage the NLP model we deployed to Elasticsearch to rerank the results based on the phrase \"flesh-eating bad guy\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "Z9DegKqb2EYV",
    "outputId": "fc3f8827-4ddd-46d0-cccd-5376e0a81bbf"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Title: The Silence of the Lambs\n",
      "Plot: A young F.B.I. cadet must receive the help of an incarcerated and manipulative cannibal killer to help catch another serial killer, a madman who skins his victims.\n",
      "\n",
      "Title: Pulp Fiction\n",
      "Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.\n",
      "\n",
      "Title: Se7en\n",
      "Plot: Two detectives, a rookie and a veteran, hunt a serial killer who uses the seven deadly sins as his motives.\n",
      "\n",
      "Title: Goodfellas\n",
      "Plot: The story of Henry Hill and his life in the mob, covering his relationship with his wife Karen Hill and his mob partners Jimmy Conway and Tommy DeVito in the Italian-American crime syndicate.\n",
      "\n",
      "Title: The Dark Knight\n",
      "Plot: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.\n",
      "\n",
      "Title: The Godfather\n",
      "Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.\n",
      "\n",
      "Title: The Departed\n",
      "Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.\n",
      "\n",
      "Title: The Usual Suspects\n",
      "Plot: A sole survivor tells of the twisty events leading up to a horrific gun battle on a boat, which began when five criminals met at a seemingly random police lineup.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "resp = client.search(\n",
    "    index=\"movies\",\n",
    "    retriever={\n",
    "        \"text_similarity_reranker\": {\n",
    "            \"retriever\": {\n",
    "                \"standard\": {\n",
    "                    \"query\": {\n",
    "                        \"multi_match\": {\"query\": \"crime\", \"fields\": [\"plot\", \"genre\"]}\n",
    "                    }\n",
    "                }\n",
    "            },\n",
    "            \"field\": \"plot\",\n",
    "            \"inference_id\": \"my-msmarco-minilm-model\",\n",
    "            \"inference_text\": \"flesh-eating bad guy\",\n",
    "        }\n",
    "    },\n",
    ")\n",
    "\n",
    "for hit in resp[\"hits\"][\"hits\"]:\n",
    "    title = hit[\"_source\"][\"title\"]\n",
    "    plot = hit[\"_source\"][\"plot\"]\n",
    "    print(f\"Title: {title}\\nPlot: {plot}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "LqVh3qt82EYW"
   },
   "source": [
    "Success! \"The Silence of the Lambs\" is our top result. Semantic reranking helped us find the most relevant result by parsing a natural language query, overcoming the limitations of lexical search which relies more on exact matching.\n",
    "\n",
    "Semantic reranking enables semantic search in a few steps, without the need for generating and storing embeddings. Being able to use open source models hosted on Hugging Face natively in your Elasticsearch cluster is great for prototyping, testing, and building search experiences."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "6yYfJjjstxwK"
   },
   "source": [
    "## Learn more\n",
    "\n",
    "- For this example we've chosen the [`cross-encoder/ms-marco-MiniLM-L-6-v2`](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) text similarity model. Refer to [the Elastic NLP model reference](https://www.elastic.co/guide/en/machine-learning/8.15/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity) for a list of third-party text similarity models supported by Elasticsearch.\n",
    "- Learn more about [integrating Hugging Face](https://www.elastic.co/search-labs/integrations/hugging-face) with Elasticsearch.\n",
    "- Check out Elastic's catalogue of Python notebooks in the [`elasticsearch-labs` repo](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks).\n",
    "- Learn more about [retrievers and reranking in Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/retrievers-reranking-overview.html)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
